-
Current parameter-efficient fine-tuning (PEFT) methods for LLMs can achieve high quality, efficient training, or scalable serving, but not all three simultaneously. To address this limitation, we investigate sparse fine-tuning and observe a remarkable improvement in generalization ability. Building on this insight, we propose a family of Structured Sparse Fine-Tuning (S²FT) methods for LLMs, which concurrently achieve state-of-the-art fine-tuning performance, training efficiency, and inference scalability. S²FT accomplishes this by "selecting sparsely and computing densely". It first selects a few heads in the multi-head attention (MHA) module and a few channels in the feed-forward network (FFN) module of each Transformer block. Next, it co-permutes the weight matrices on both sides of the coupled structures in LLMs so that the selected components in each layer form a dense submatrix. Finally, S²FT performs in-place gradient updates on all submatrices. Theoretical analysis and empirical results show that our method prevents overfitting and forgetting, delivers SOTA performance on both commonsense and arithmetic reasoning with average improvements of 4.6% and 1.3% over LoRA, and surpasses full fine-tuning (FT) by 11.5% when generalizing to various domains after instruction tuning. Using our partial backpropagation algorithm, S²FT reduces training memory by up to 3× and improves latency by 1.5-2.7× compared to full FT, while delivering an average 10% improvement over LoRA on both metrics. We further demonstrate that the weight updates in S²FT can be decoupled into adapters, enabling effective fusion, fast switching, and efficient parallelism for serving multiple fine-tuned models.
Free, publicly accessible full text available December 10, 2025
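The "select sparsely, compute densely" idea described above can be illustrated with a toy FFN. The following is a minimal sketch under stated assumptions, not the authors' implementation: the sizes, the random channel selection, the ReLU activation, the fake gradient, and the choice to train only the down-projection block are all hypothetical. It shows only that permuting the hidden channels consistently on both matrices leaves the output unchanged while making the selected channels contiguous.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden, k = 4, 8, 3  # toy sizes (assumed); k = selected channels

W_up = rng.standard_normal((d_hidden, d_model))    # input -> hidden
W_down = rng.standard_normal((d_model, d_hidden))  # hidden -> output


def ffn(x, w_up, w_down):
    """Two-layer FFN with ReLU (activation chosen for illustration)."""
    return w_down @ np.maximum(w_up @ x, 0.0)


# Hypothetical selection rule: pick k hidden channels at random.
selected = rng.choice(d_hidden, size=k, replace=False)
rest = np.setdiff1d(np.arange(d_hidden), selected)
perm = np.concatenate([selected, rest])  # selected channels come first

# Co-permute: rows of W_up and columns of W_down use the same order.
W_up_p = W_up[perm, :]
W_down_p = W_down[:, perm]

# The permuted network computes the same function.
x = rng.standard_normal(d_model)
same_output = np.allclose(ffn(x, W_up, W_down), ffn(x, W_up_p, W_down_p))

# The trainable part is now one dense block: the first k columns of W_down_p.
frozen_before = W_down_p[:, k:].copy()
fake_grad = rng.standard_normal((d_model, k))  # stand-in for a real gradient
W_down_p[:, :k] -= 0.01 * fake_grad            # in-place update of the block
frozen_unchanged = np.array_equal(frozen_before, W_down_p[:, k:])

print(same_output, frozen_unchanged)
```

Because the trainable weights sit in one contiguous slice, the update is an ordinary dense operation on a small matrix rather than a scattered sparse write.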
-
The motility mechanisms of microorganisms are critical virulence factors, enabling their spread and survival during infection. Motility is frequently characterized by qualitative analysis of macroscopic colonies, yet the standard quantification method has mainly been limited to manual measurement. Recent studies have applied deep learning to the classification and segmentation of specific microbial species in microscopic images, but less work has focused on macroscopic colony analysis. Here, we advance computational tools for analyzing colonies of Proteus mirabilis, a bacterium that produces a macroscopic bullseye-like pattern via periodic swarming, a process implicated in its virulence. We present a dual-task pipeline for segmenting (1) the macroscopic colony, including faint outer swarm rings, and (2) internal ring boundaries, which are unique features of oscillatory swarming. Our convolutional neural network for patch-based colony segmentation and our U-Net with a VGG-11 encoder for ring boundary segmentation achieved test Dice scores of 93.28% and 83.24%, respectively. The predicted masks at times improved on the ground truths from our automated annotation algorithms. We demonstrate how applying our pipeline to a typical swarming assay simplifies colony analysis and enables precise measurements of more complex pattern features than those historically quantified. An implementation of our work can be found at https://github.com/daninolab/proteus-mirabilis.
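For readers unfamiliar with the Dice score reported above, it measures the overlap between a predicted mask A and a ground-truth mask B as 2|A∩B| / (|A| + |B|). The sketch below is a plain-Python illustration on tiny hand-made masks; the function name, the mask values, and the convention of returning 1.0 for two empty masks are assumptions, not taken from the authors' repository.

```python
def dice_score(pred, truth):
    """Dice coefficient of two same-shaped binary masks given as lists of rows."""
    intersection = 0
    total = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += p * t  # counts pixels set in both masks
            total += p + t         # |A| + |B|
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy masks (hypothetical): 3 pixels set in each, 2 of them shared.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
truth_mask = [[1, 0, 0],
              [0, 1, 1]]

score = dice_score(pred_mask, truth_mask)
print(round(score, 4))  # 2*2 / (3+3) = 0.6667
```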
-
Face detection and recognition benchmarks have shifted toward more difficult environments. The challenge presented in this paper addresses the next step toward automatic detection and identification of people from outdoor surveillance cameras. While face detection has shown remarkable success on images collected from the web, surveillance footage includes more diverse occlusions, poses, weather conditions, and image blur. Although face verification and closed-set face identification have surpassed human capabilities on some datasets, open-set identification is much more complex because it must reject both unknown identities and false accepts from the face detector. We show that unconstrained face detection can approach high detection rates, albeit with moderate false accept rates. By contrast, open-set face recognition is currently weak and requires much more attention.
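The difference between closed-set and open-set identification can be made concrete with a small sketch. The code below is an illustrative assumption, not the method evaluated in the challenge: it uses cosine similarity on made-up two-dimensional embeddings and a hypothetical threshold of 0.8. A closed-set system would always return the nearest gallery identity, whereas the open-set variant returns `None` when no gallery entry is similar enough, which is how unknown people and non-face detections would be rejected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_open_set(probe, gallery, threshold=0.8):
    """Return the best-matching identity, or None if below the threshold."""
    best_id, best_sim = None, -1.0
    for identity, embedding in gallery.items():
        sim = cosine(probe, embedding)
        if sim > best_sim:
            best_id, best_sim = identity, sim
    return best_id if best_sim >= threshold else None


# Hypothetical gallery of enrolled identities with toy embeddings.
gallery = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}

known = identify_open_set([0.9, 0.1], gallery)    # close to "alice"
unknown = identify_open_set([1.0, 1.0], gallery)  # similarity ~0.71 to both
print(known, unknown)
```

The threshold trades off the two error types named in the abstract: raising it rejects more unknowns and detector false accepts but also rejects more genuine matches.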